Contributing Source Code to the Chromium Browser Engine! - GSoC 2025 Proposal - Interaction to Next Paint (INP) Subparts

I was recently fortunate to be selected for GSoC 2025. This article is my proposal to the Chromium project for the Web Performance API, aimed at improving how the browser engine handles the INP performance metric.

Xiaohongshu post: "CQU sophomore selected as a Google GSoC contributor - g122622" - https://www.xiaohongshu.com/discovery/item/681e22300000000023011448?source=webshare&xhsshare=pc_web&xsec_token=ABqekbzfvoxIozXy36qJLYrAPjDUL5wByx4W95rEowRts=&xsec_source=pc_share


Overview & Background

Brief introduction about me

I’m a sophomore studying Computer Science at Chongqing University, China. I have been self-studying computer science since I was 9 years old and have nearly four years of experience researching the principles and design philosophies behind the Chromium project.

Recently, I happened to be studying Web performance standards. I was very excited when I came across this project on GSoC; it feels like it was meant to be!☺️

The present situation of INP


The INP metric was incorporated into the Core Web Vitals (CWV) system last year (2024). This year, the Chromium team plans to help developers analyze INP latency in depth by introducing a "subparts" reporting mechanism similar to the one used for the LCP metric. This approach breaks INP down into sub-dimensions such as input delay, processing time, and presentation delay.

The Chromium team has already implemented preliminary measurements of these timings within the Event Timing API.

The plan is to modify Chromium's Event Timing implementation so that fine-grained timestamps (including but not limited to timeStamp, processingStart, and processingEnd), which currently exist only in the renderer process, are transmitted to the browser process via Mojo IPC and ultimately integrated into the UKM metric system and the CrUX experimental dataset.

My understanding of how EventTiming works

This is my overall understanding of how EventTiming operates:

The EventTiming module is located in the third_party/blink/renderer/core/timing/ directory.

EventTiming::TryCreate() is the entry point for event timing capture and is primarily invoked in the following scenarios:

[Screenshot: call sites of EventTiming::TryCreate()]

EventDispatcher::Dispatch()
├─ EventTiming::TryCreate()  // Create the EventTiming object (success or failure is transparent to subsequent code)
│  ├─ DOMWindowPerformance::performance()
│  ├─ EventTiming::processing_start = Now();  // Get the processing start time
│  ├─ EventTiming::HandleInputDelay(..., processing_start)  // Collect and report some performance data (such as FID)
│  └─ EventTiming::EventTiming(processing_start, ...)
│     └─ performance_->EventTimingProcessingStart(..., processing_start, ...)
│        // ↑ Record the event's creation_time and processing_start
├─ target->DispatchEvent()  // Trigger event callbacks
│  └─ v8::Function::Call()  // Call into V8 to run the callback; the C++ stack blocks here until the JS stack returns
└─ EventTiming::~EventTiming()  // RAII takes effect, automatically invoking the destructor
   └─ WindowPerformance::EventTimingProcessingEnd()
      // ↑ Record processing_end and arm reporting via EnsureSendTimer
[IPC callback]
└─ WindowPerformance::ReportEventTimings();  // Report metrics such as INP

EventTiming adopts a non-intrusive design, so its lifetime is transparent to the main event-dispatch flow. Through C++'s RAII mechanism, it automatically records the processing end time on destruction:

// third_party/blink/renderer/core/timing/event_timing.cc
EventTiming::~EventTiming() {
  if (event_) {
    performance_->EventTimingProcessingEnd(*event_, Now());
  }
}

Incidentally, this design also leverages the LIFO nature of the call stack to handle nested events (such as an input event triggered within a pointer event), ensuring that start and end calls are paired correctly.

[Diagram: nested event timing objects on the stack]

The event dispatch call target->DispatchEvent(*pointer_event) ultimately reaches EventTarget::FireEventListeners, which iterates over the registered listener vector and invokes each callback in turn.

  • If the listener is a JavaScript function (JSBasedEventListener), V8 is used to execute the JS callback here.
  • If the listener is a native C++ object (such as built-in event handlers), it directly calls the C++ method.

For the former case (JS listeners), in JSBasedEventListener::Invoke, triggering the JS callback is fully synchronous:

v8::TryCatch try_catch(isolate);
try_catch.SetVerbose(true);
InvokeInternal(*event->currentTarget(), *event, js_event);  // Synchronously call the JS function

InvokeInternal eventually executes the JS function through V8’s v8::Function::Call(). The C++ main thread waits synchronously here for the JS function to finish execution, during which it is blocked and unable to handle other tasks.

The procedure for passing INP data from the renderer process to the browser process:

[Class diagram: INP data flow from the renderer process to the browser process]

This class diagram illustrates the pipeline: event handling starts in WindowPerformance ➡️ ResponsivenessMetrics performs the metric calculations ➡️ results are reported step by step through the framework client layer ➡️ PageTimingMetricsSender aggregates the data ➡️ finally, mojom structures carry it across processes.

(The arrows indicate the direction of calls/data flow.)


Tasks

1. Update the PageLoadMetricsSender mojo struct

Mojo is a cross-platform IPC framework that grew out of Chromium to support both intra-process and inter-process communication. I am very familiar with its principles.


The task: change reporting to cover each subpart of each event timing (with a non-zero interactionId), rather than just a single duration value.

The general idea: we can draw on the fields of the following C++ structure in performance_event_timing.h when customizing what the sender reports.😉

// third_party/blink/renderer/core/timing/performance_event_timing.h

struct EventTimingReportingInfo {
  uint64_t presentation_index;           // Presentation index
  base::TimeTicks creation_time;         // Time when the event was created
  base::TimeTicks enqueued_to_main_thread_time;  // Time when it was enqueued to the main thread
  base::TimeTicks processing_start_time; // Time when processing started
  base::TimeTicks processing_end_time;   // Time when processing ended
  base::TimeTicks commit_finish_time;    // Time when the rendering commit finished
  base::TimeTicks presentation_time;     // Time of presentation
  base::TimeTicks fallback_time;         // Fallback time
  base::TimeTicks render_start_time;     // Time when rendering started
  std::optional<int> key_code;           // Key code (for keyboard events)
  std::optional<PointerId> pointer_id;   // Pointer ID (for pointer events)
  bool prevent_counting_as_interaction;  // Whether to prevent counting as an interaction
  bool is_processing_fully_nested_in_another_event;  // Whether processing is fully nested within another event
};

Based on the C++ structure above, the following metrics can be derived.
(This table outlines each metric's details; the metric in bold is the most important 🚧)

| Metric name | Calculation formula | Description |
| --- | --- | --- |
| Enqueue delay (input delay) | `enqueued_to_main_thread_time - creation_time` | Measures the delay from when an event is created to when it is added to the main-thread queue. This can help identify whether events are waiting a long time before being processed. |
| Processing delay | `processing_start_time - enqueued_to_main_thread_time` | The time from when an event is added to the main-thread queue to when processing begins. It reflects how the main thread's busyness affects event handling. |
| Processing duration | `processing_end_time - processing_start_time` | The time spent processing the event. This helps in understanding the efficiency of the event-processing logic. |
| Render preparation time | `render_start_time - processing_end_time` | The duration from the end of event processing to the start of rendering. It may include style calculation, layout, and other preparatory work before rendering. |
| Render duration | `commit_finish_time - render_start_time` | The actual time spent rendering, from the start of rendering until the commit finishes. |
| Presentation delay | `presentation_time - commit_finish_time` | The delay from when the commit finishes to when the frame is finally presented to the user. This part can reveal bottlenecks in compositing and display. |
| **Total interaction-to-presentation time** | `presentation_time - creation_time` | The core metric of INP (Interaction to Next Paint): the total time from event creation (creation_time) to presentation to the user (presentation_time). It directly reflects the user's interaction experience. |

Below is the original code in components/page_load_metrics/common/page_load_metrics.mojom:

// components/page_load_metrics/common/page_load_metrics.mojom

// Metrics about general input delay.
struct InputTiming {

  // The number of user interactions, including click, tap and key press.
  uint64 num_interactions = 0;

  // List of user interactions since last update.
  // TODO(crbug.com/382949422): Remove the needless union wrapper.
  UserInteractionLatencies max_event_durations;
};

// Data for user interaction latencies which can be meausred in different ways.
union UserInteractionLatencies {
  array<UserInteractionLatency> user_interaction_latencies;
};

// The latency and the type of a user interaction.
struct UserInteractionLatency {
  mojo_base.mojom.TimeDelta interaction_latency;
  // The one-based offset of the interaction in time-based order; 1 for the
  // first interaction, 2 for the second, etc.
  uint64 interaction_offset;
  // The time the interaction occurred, relative to navigation start.
  mojo_base.mojom.TimeTicks interaction_time;
};

Modified code:

// (modified by me) components/page_load_metrics/common/page_load_metrics.mojom

// New subpart data structure.
struct InteractionSubpartTiming {
  // Input delay: time from event trigger to start of processing.
  mojo_base.mojom.TimeDelta input_delay;
  // Processing time: duration of listener execution.
  mojo_base.mojom.TimeDelta processing_time;
  // Presentation delay: rendering pipeline duration.
  mojo_base.mojom.TimeDelta presentation_delay;
  // Compatibility field: total delay should equal the sum of the three parts.
  // @deprecated Will be removed after gradual migration.
  mojo_base.mojom.TimeDelta total_duration;
};

// Modified user interaction latency structure.
struct UserInteractionLatency {
  // Original fields, retained during the transition period.
  mojo_base.mojom.TimeDelta interaction_latency;
  uint64 interaction_offset;
  mojo_base.mojom.TimeTicks interaction_time;

  // New subpart data.
  InteractionSubpartTiming subparts;

  // New interaction type identifier.
  // 0: click/tap, 1: keyboard, 2: drag, etc.
  uint8 interaction_type;
};

// Reworked union structure.
union UserInteractionLatencies {
  // Upgraded to an array carrying subpart data.
  array<UserInteractionLatency> detailed_interactions;

  // Old format, retained for compatibility during the transition period.
  array<mojo_base.mojom.TimeDelta> legacy_durations;
};

// Extended input timing structure.
struct InputTiming {
  // Split into basic statistics and detailed data.
  uint64 num_interactions;
  UserInteractionLatencies interaction_details;

  // New performance baseline identifier.
  // 0: uncalibrated, 1: lab environment, 2: real user data.
  uint8 measurement_quality;
};

I added the InteractionSubpartTiming struct specifically to carry subpart timing data, while retaining the existing UserInteractionLatency as a container struct; the new data is nested through its subparts field.

Update the Blink renderer-side "plumbing" to migrate these values from window_performance.cc / responsiveness_metrics.cc to PageTimingMetricsSender.

I plan to move the logic into the function below, using my new mojo struct to deliver the data.


// components/page_load_metrics/renderer/page_timing_metrics_sender.cc
// This is where timing data is ultimately sent from the renderer to the browser.
void PageTimingMetricsSender::DidObserveUserInteraction(
    base::TimeTicks max_event_start,
    base::TimeTicks max_event_queued_main_thread,
    base::TimeTicks max_event_commit_finish,
    base::TimeTicks max_event_end,
    uint64_t interaction_offset) {
  input_timing_delta_->num_interactions++;
  metadata_recorder_.AddInteractionDurationMetadata(max_event_start,
                                                    max_event_end);
  metadata_recorder_.AddInteractionDurationAfterQueueingMetadata(
      max_event_start, max_event_queued_main_thread, max_event_commit_finish,
      max_event_end);
  base::TimeDelta max_event_duration = max_event_end - max_event_start;
  input_timing_delta_->max_event_durations->get_user_interaction_latencies()
      .emplace_back(mojom::UserInteractionLatency::New(
          max_event_duration, interaction_offset, max_event_start));
  EnsureSendTimer();
}

2. From the browser side, update the UkmPageLoadMetricsObserver

For the changes to the UserInteractionLatency structure, I propose the following scheme, which uses the strategy pattern to handle both the old and new data formats.😊

Original code:

// components/page_load_metrics/browser/responsiveness_metrics_normalization.cc

std::optional<mojom::UserInteractionLatency>
ResponsivenessMetricsNormalization::ApproximateHighPercentile() const {
  std::optional<mojom::UserInteractionLatency> approximate_high_percentile;
  if (worst_ten_latencies_.size()) {
    uint64_t index =
        std::min(static_cast<uint64_t>(worst_ten_latencies_.size() - 1),
                 static_cast<uint64_t>(num_user_interactions_ /
                                       kHighPercentileUpdateFrequency));
    approximate_high_percentile = worst_ten_latencies_[index];
  }
  return approximate_high_percentile;
}

std::optional<mojom::UserInteractionLatency>
ResponsivenessMetricsNormalization::worst_latency() const {
  std::optional<mojom::UserInteractionLatency> worst_latency;
  if (worst_ten_latencies_.size()) {
    worst_latency = worst_ten_latencies_[0];
  }
  return worst_latency;
}

void ResponsivenessMetricsNormalization::AddNewUserInteractionLatencies(
    uint64_t num_new_interactions,
    const mojom::UserInteractionLatencies& max_event_durations) {
  num_user_interactions_ += num_new_interactions;
  // Normalize max event durations.
  NormalizeUserInteractionLatencies(max_event_durations);
}

void ResponsivenessMetricsNormalization::ClearAllUserInteractionLatencies() {
  num_user_interactions_ = 0;
  worst_ten_latencies_ = std::vector<mojom::UserInteractionLatency>();
}

Modified code:

// (modified by me) components/page_load_metrics/browser/responsiveness_metrics_normalization.cc
// Note: worst_ten_latencies_ is now a std::vector<mojom::UserInteractionLatencyPtr>.

using InteractionLatencyPtr = mojom::UserInteractionLatencyPtr;

// Add subpart delay calculation strategies.
namespace {

// Abstract strategy interface.
class LatencyCalculator {
 public:
  virtual base::TimeDelta GetTotalLatency(
      const InteractionLatencyPtr& latency) const = 0;
  virtual ~LatencyCalculator() = default;
};

// Strategy for the new data format: sum the three subparts.
class SubpartsLatencyCalculator : public LatencyCalculator {
 public:
  base::TimeDelta GetTotalLatency(
      const InteractionLatencyPtr& latency) const override {
    return latency->subparts->input_delay +
           latency->subparts->processing_time +
           latency->subparts->presentation_delay;
  }
};

// Strategy for the old data format: use the single duration field.
class LegacyLatencyCalculator : public LatencyCalculator {
 public:
  base::TimeDelta GetTotalLatency(
      const InteractionLatencyPtr& latency) const override {
    return latency->interaction_latency;
  }
};

}  // namespace

// Modified key functions.
std::optional<InteractionLatencyPtr>
ResponsivenessMetricsNormalization::ApproximateHighPercentile() const {
  if (worst_ten_latencies_.empty())
    return std::nullopt;

  const uint64_t index = std::min<uint64_t>(
      worst_ten_latencies_.size() - 1,
      num_user_interactions_ / kHighPercentileUpdateFrequency);

  return ApplyLatencySelectionStrategy(worst_ten_latencies_[index]);
}

std::optional<InteractionLatencyPtr>
ResponsivenessMetricsNormalization::worst_latency() const {
  if (worst_ten_latencies_.empty())
    return std::nullopt;
  return ApplyLatencySelectionStrategy(worst_ten_latencies_[0]);
}

void ResponsivenessMetricsNormalization::AddNewUserInteractionLatencies(
    uint64_t num_new_interactions,
    const mojom::UserInteractionLatencies& max_event_durations) {
  num_user_interactions_ += num_new_interactions;

  // Strategy pattern selector: pick the calculator matching the data format.
  std::unique_ptr<LatencyCalculator> strategy;
  if (max_event_durations.is_detailed_interactions()) {
    strategy = std::make_unique<SubpartsLatencyCalculator>();
  } else {
    strategy = std::make_unique<LegacyLatencyCalculator>();
  }

  if (max_event_durations.is_detailed_interactions()) {
    for (const auto& latency :
         max_event_durations.get_detailed_interactions()) {
      InsertSorted(worst_ten_latencies_,
                   strategy->GetTotalLatency(latency), latency);
    }
  } else {
    // Convert the old data format so downstream code sees a single shape.
    for (const auto& legacy_latency :
         max_event_durations.get_legacy_durations()) {
      auto converted = mojom::UserInteractionLatency::New();
      converted->interaction_latency = legacy_latency;
      converted->subparts = mojom::InteractionSubpartTiming::New();
      converted->subparts->total_duration = legacy_latency;
      InsertSorted(worst_ten_latencies_,
                   strategy->GetTotalLatency(converted), converted);
    }
  }

  // Keep at most ten elements.
  if (worst_ten_latencies_.size() > 10) {
    worst_ten_latencies_.resize(10);
  }
}

// New private method: when the new format is present, mirror the summed total
// into the legacy compatibility field before handing the value out.
InteractionLatencyPtr
ResponsivenessMetricsNormalization::ApplyLatencySelectionStrategy(
    const InteractionLatencyPtr& latency) const {
  InteractionLatencyPtr result = latency.Clone();
  if (latency->subparts) {
    result->interaction_latency = latency->subparts->input_delay +
                                  latency->subparts->processing_time +
                                  latency->subparts->presentation_delay;
  }
  return result;
}

// New private method: insert while keeping the container sorted by total
// latency in descending order (worst first).
void ResponsivenessMetricsNormalization::InsertSorted(
    std::vector<InteractionLatencyPtr>& container,
    base::TimeDelta new_latency,
    const InteractionLatencyPtr& raw_data) {
  InteractionLatencyPtr clone = raw_data.Clone();
  // Cache the computed total in interaction_latency so comparisons stay
  // consistent regardless of the input format.
  clone->interaction_latency = new_latency;
  auto it = std::lower_bound(
      container.begin(), container.end(), new_latency,
      [](const InteractionLatencyPtr& element, base::TimeDelta value) {
        return element->interaction_latency > value;
      });
  container.insert(it, std::move(clone));
}

My modification approach:

[Diagram: modification approach overview]

Complementary Test Cases:

(Some helpers, such as MakeInteraction, are defined by me; due to article-length limits, their definitions are omitted here 🥲)

// components/page_load_metrics/browser/responsiveness_metrics_normalization_unittest.cc

TEST_F(ResponsivenessMetricsNormalizationTest, HybridDataProcessing) {
  // A mojom union holds only one alternative at a time, so mixed old- and
  // new-format data arrives as separate batches.
  mojom::UserInteractionLatencies detailed_data;
  detailed_data.set_detailed_interactions({
      MakeInteraction(50ms, 30ms, 20ms),  // total 100ms
      MakeInteraction(80ms, 10ms, 10ms)   // total 100ms
  });
  normalization_->AddNewUserInteractionLatencies(2, detailed_data);

  mojom::UserInteractionLatencies legacy_data;
  legacy_data.set_legacy_durations({base::Milliseconds(90)});
  normalization_->AddNewUserInteractionLatencies(1, legacy_data);

  EXPECT_EQ(3u, normalization_->num_user_interactions());
  ASSERT_EQ(3u, normalization_->worst_ten_latencies().size());

  // Verify the old-format conversion: the 90ms legacy entry sorts after the
  // two 100ms interactions and keeps its duration in both compatibility fields.
  const auto& converted = normalization_->worst_ten_latencies()[2];
  EXPECT_EQ(90, converted->subparts->total_duration.InMilliseconds());
  EXPECT_EQ(90, converted->interaction_latency.InMilliseconds());
}

Affected Modules & Files

Blink renderer

blink/renderer/core/timing/window_performance.cc
blink/renderer/core/timing/responsiveness_metrics.cc

Metrics sender

components/page_load_metrics/renderer/page_timing_metrics_sender.cc
components/page_load_metrics/common/page_load_metrics.mojom

UKM

chrome/browser/page_load_metrics/observers/core/ukm_page_load_metrics_observer.cc
components/page_load_metrics/browser/responsiveness_metrics_normalization.cc
tools/metrics/ukm/ukm.xml


Development Process

| Phase | Weeks | Deliverables | Risk control |
| --- | --- | --- | --- |
| Environment setup & prototype verification | 2 | Chromium debugging environment; PoC for sub-dimension data collection | Accelerate compilation using GCP instances |
| Mojo protocol modification | 3 | Extended IPC interfaces; renderer-process modifications | Submit incremental CLs (change lists) and get timely code reviews |
| UKM integration | 2 | New UKM metric reporting; ukm.xml updates | Update ukm.xml validation rules in sync |
| Test suite development | 2 | Web Platform Tests (WPT) cases; unit-test coverage | Use the Chromium test framework |
| Performance regression testing | 1 | Benchmark reports; memory-usage analysis | Use the Telemetry performance testing framework |
| Documentation & wrap-up | 2 | Design documentation; CrUX integration plan | Reserve buffer time |

My Contributions to Other Open Source Projects

VSCode

Bytenode


About Me

I’m a sophomore studying Computer Science at Chongqing University, with a strong interest in browser internals. As a heavy GitHub user, I’m deeply familiar with collaborative workflows on the platform, including issue tracking, pull requests, and code reviews. I enjoy writing clean, maintainable code and have solid experience with design patterns and object-oriented programming.

Recently, I’ve been diving into the Chromium codebase, focusing on its architecture and core components. Through hands-on exploration, I’ve gained practical knowledge of the Mojo IPC framework and learned to utilize Web Performance APIs to analyze and optimize rendering pipelines. To solidify my understanding, I’ve built small experimental modules that interact with Chromium’s internals, though these are still works in progress.

My journey started when I became curious about how browsers render web pages efficiently. Books like How Browsers Work and Chromium’s official documentation became my go-to resources for self-learning. I’ve also documented some of my explorations through technical blog posts (written in Chinese), including an analysis of Chromium’s multi-process architecture and a tutorial on measuring page load metrics using PerformanceObserver.


Extra Info

Name: Yi Guo
Email: 20230503@stu.cqu.edu.cn
Github: github.com/g122622
Time Zone: UTC+08:00 (China)
Location: Chongqing, China